SCHRÖDER'S PROBLEMS AND SCALING LIMITS OF RANDOM TREES

JIM PITMAN AND DOUGLAS RIZZOLO 

Abstract. In his now classic paper [17], Schröder posed four combinatorial problems about
the number of certain types of bracketings of words and sets. Here we address what these
bracketings look like on average. For each of the four problems we prove that a uniform
pick from the appropriate set of bracketings, when considered as a tree, has the Brownian
continuum random tree as its scaling limit as the size of the word or set goes to infinity.

1. Introduction



In his now classic paper [17], Schröder posed four combinatorial problems about bracketings
of words and sets: how many binary bracketings are there of a word of length n? how
many bracketings are there of a word of length n? how many binary bracketings are there of
a set of size n? and how many bracketings are there of a set of size n? These questions are
well studied and [18] gives a good account of the solutions. In this paper we are concerned
with a probabilistic variation on these questions: for each of the above questions, if you
select a bracketing uniformly at random, what does it look like? To answer these questions,
we will use the well known correspondence between bracketings and various types
of trees. We will then apply Aldous's theory of continuum trees, originally developed in the
series [1], [2], [3] and subsequently studied by many authors, to study the scaling limits of these
trees. Let us briefly describe the correspondence between bracketings and trees.

The first problem: The correspondence is best illustrated by example. For n = 4 the 
binary word bracketings are 

(xx)(xx) x(x(xx)) ((xx)x)x x((xx)x) (x(xx))x. 

A binary bracketing of a word with n letters corresponds to a rooted ordered binary tree
with n leaves in a natural way. This is most easily described if we put brackets around 
the entire word and each letter, which are left out of our example because they are visually 
cumbersome. The tree corresponding to a bracketing is constructed recursively. A single 
bracketed letter is a leaf. For a word with more than one letter, the bracketing of the whole 
word is the root. Attached as subtrees to the root are, in order of appearance, the trees 
corresponding to the maximal proper bracketed subwords. For n = 4, this is illustrated by
Figure 1.

It is worth noting that these trees are in bijection with rooted ordered trees with n vertices, 
but this correspondence is not as natural as the one above. 

(xx)(xx) x(x(xx)) ((xx)x)x x((xx)x) (x(xx))x

FIGURE 1. Binary word bracketings and rooted ordered binary trees for n = 4 

The second problem: General word bracketings are defined similarly to binary word 
bracketings and correspond to rooted ordered trees with n leaves and no vertices with out 
degree equal to one. 
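
The recursive description of word bracketings can be checked by brute force for small n. The sketch below is only an illustration: the function names `bracketings` and `compositions` are hypothetical, and, unlike the displayed examples, the outermost pair of brackets is kept. It splits a word into consecutive blocks (exactly two blocks in the binary case) and brackets each block recursively.

```python
from functools import lru_cache


def compositions(n, binary):
    """Ordered ways to write n as a sum of k >= 2 positive parts (k == 2 if binary)."""
    def rec(remaining, parts):
        if remaining == 0:
            if len(parts) >= 2 and (not binary or len(parts) == 2):
                yield tuple(parts)
            return
        for first in range(1, remaining + 1):
            yield from rec(remaining - first, parts + [first])
    return list(rec(n, []))


@lru_cache(maxsize=None)
def bracketings(n, binary):
    """All bracketings of a word of n letters, written as strings over 'x', '(' and ')'."""
    if n == 1:
        return ("x",)  # a single letter is a leaf
    results = []
    for parts in compositions(n, binary):
        # Concatenate, in order, one bracketing of each maximal proper subword.
        partial = [""]
        for m in parts:
            partial = [p + b for p in partial for b in bracketings(m, binary)]
        results.extend("(" + p + ")" for p in partial)
    return tuple(results)


for n in range(1, 6):
    print(n, len(bracketings(n, True)), len(bracketings(n, False)))
```

For n = 4 this produces the five binary bracketings listed above and eleven general bracketings.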

The third problem: The trees associated to binary set bracketings are constructed 
similarly to those associated to binary word bracketings. They are rooted, unordered,
leaf-labeled binary trees. Figure 2 shows a sample of the correspondence for n = 4 (for n = 4
there are 15 bracketings, so showing the whole correspondence is unwieldy). 

Figure 2. Binary set bracketings and rooted unordered leaf-labeled binary
trees for n = 4 

The fourth problem: General set bracketings are defined similarly to binary set bracketings
and correspond to rooted unordered leaf-labeled trees with n leaves and no vertices
with out degree equal to one. In the literature, these trees are also called fragmentation trees
[9] and hierarchies [7]. The correspondence for n = 3 is in Figure 3.
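
The set bracketings of the third and fourth problems can also be enumerated directly for small n. The following sketch is illustrative only: `set_partitions` and `set_bracketings` are hypothetical helper names, and a bracketing is encoded as a nested frozenset whose elements are either labels or smaller bracketings.

```python
from itertools import product


def set_partitions(items):
    """Yield every partition of the list `items` into nonempty blocks, each once."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into one of the existing blocks ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition


def set_bracketings(labels, binary):
    """All bracketings of the set `labels`, encoded as nested frozensets."""
    labels = list(labels)
    if len(labels) == 1:
        return [labels[0]]  # a single label is a leaf
    trees = []
    for partition in set_partitions(labels):
        if len(partition) < 2 or (binary and len(partition) != 2):
            continue
        # Choose a bracketing of each block independently.
        choices = [set_bracketings(block, binary) for block in partition]
        for combo in product(*choices):
            trees.append(frozenset(combo))
    return trees


print(len(set_bracketings([1, 2, 3, 4], True)))   # binary set bracketings, n = 4
print(len(set_bracketings([1, 2, 3], False)))     # general set bracketings, n = 3
```

The counts agree with the text: 15 binary set bracketings for n = 4, and 4 set bracketings for n = 3.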

Figure 3. Set bracketings and rooted unordered leaf-labeled trees for n = 3

Scaling limits of uniform picks from the trees appearing in the first and third problems are
well studied. A uniform pick from rooted ordered binary trees with n leaves has the same
distribution as a Galton-Watson tree with offspring distribution $\xi_0 = \xi_2 = 1/2$ conditioned
to have 2n − 1 vertices. Thus it falls within the scope of the results in [3]. Similarly, a
uniform pick from rooted unordered leaf-labeled binary trees with n leaves is a uniform
binary fragmentation tree with n leaves, and scaling limits of these are studied in [9]. In
this paper we present a unified approach that handles all four of these types of
trees simultaneously. Our method is essentially to link the trees appearing in these four
problems to Galton-Watson trees conditioned on their number of leaves. Scaling limits
for these Galton-Watson trees were recently proven in [16]. We also draw a connection to
certain Gibbs fragmentation trees, which were originally studied in [12]. Our main result is
the following theorem, the notation for which will be fully explained later.

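Theorem 1. For i = 1, 2, 3, 4, let $T_n^{(i)}$ be a uniform random tree of the type appearing in
Schröder's i'th problem with n leaves. For each i and n, equip $T_n^{(i)}$ with the graph metric where
edges have length one and the uniform probability measure on its leaves. We then have the
following limits with respect to the rooted Gromov-Hausdorff-Prokhorov topology:

(i) $\frac{1}{\sqrt{n}} T_n^{(1)} \xrightarrow{d} 2\sqrt{2}\, T^{Br}$

(ii) $\frac{1}{\sqrt{n}} T_n^{(2)} \xrightarrow{d} \frac{1}{\sqrt{3\sqrt{2}-4}}\, T^{Br}$

(iii) $\frac{1}{\sqrt{n}} T_n^{(3)} \xrightarrow{d} 2\sqrt{2}\, T^{Br}$

(iv) $\frac{1}{\sqrt{n}} T_n^{(4)} \xrightarrow{d} \frac{2}{\sqrt{4\log(2)-2}}\, T^{Br}$

where $T^{Br}$ is the Brownian continuum random tree.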



As noted above, parts (i) and (iii) were originally proven in [3] and [9] respectively. Parts
(ii) and (iv) appear to be new.

The paper is organized as follows. In Section 2 we rigorously introduce the models of
random trees under consideration here. In Section 3 we introduce the analytic setting for
Theorem 1 and end with the proof of this theorem. Finally, in Section 4 we use elementary
methods from analytic combinatorics to compute some asymptotic properties of these trees
explicitly.

2. Combinatorial models and Galton-Watson trees

In this section we develop several combinatorial and probabilistic models of trees. There
are two primary types of trees we will be dealing with in the sequel: rooted ordered unlabeled
trees and rooted unordered leaf-labeled trees. Combinatorial relations between rooted
ordered unlabeled trees and rooted unordered labeled trees are well known when the size of a
tree is its number of vertices (see e.g. [151 13 13 E]). In this section we develop analogous
relations when the size of a tree is its number of leaves. Particularly important for us is Corollary
2, which relates Schröder's problems to particular Galton-Watson trees conditioned on their
number of leaves.

We briefly give an account of the formal constructions of the trees we will be considering.
Fix a countably infinite set S; we will consider the vertex sets of all graphs discussed to be
subsets of S. Let $\mathcal{T}_n$ denote the set of rooted unordered trees with n leaves (where the root is
considered a leaf if and only if it is the only vertex in the tree) whose leaves are labeled
by {1, 2, ..., n}. More precisely, we consider the set $\tilde{\mathcal{T}}_n$ of all trees whose vertex sets are
contained in S that have a distinguished root and n leaves (where the root is considered a
leaf if and only if it is the only vertex in the tree), whose leaves are labeled by {1, 2, ..., n},
and set $\mathcal{T}_n = \tilde{\mathcal{T}}_n / \sim$, where $t \sim s$ if there is a root and label preserving isomorphism from t
to s. This is the only time we shall go through this formal construction, but all other sets of
trees we discuss should be considered as formally constructed in an analogous fashion. We
also let $\mathcal{T} = \cup_{n \ge 1} \mathcal{T}_n$. We let $\mathcal{T}_n^{(o)}$ be the set of rooted ordered unlabeled trees with n leaves
and $\mathcal{T}^{(o)} = \cup_{n \ge 1} \mathcal{T}_n^{(o)}$.

We will be proving analogous results for trees in $\mathcal{T}$ and $\mathcal{T}^{(o)}$, so analogous that the only
difference in the statements will be the superscript (o). To avoid repetition we will use $\mathcal{T}^*$
and $\mathcal{T}_n^*$ when we don't want to specify whether we are in $\mathcal{T}$, $\mathcal{T}^{(o)}$, $\mathcal{T}_n$ or $\mathcal{T}_n^{(o)}$. That is, in a
given result you may replace all the *'s by nothing or (o). For a tree $t \in \mathcal{T}^*$, we define |t| to
be the number of leaves in t and #t to be the number of vertices in t.

2.1. Probabilities on trees. Let $\zeta = (\zeta_i)_{i \ge 0}$ be a sequence of numbers. We may then
define the weight of a tree $t \in \mathcal{T}^*$ to be

$$w_\zeta(t) = \prod_{v \in t} \zeta_{\deg(v)}.$$

Here and throughout, deg(v) is the out degree of v, i.e., the number of children of v. We will
assume the following conditions:

Condition 1. (i) $\zeta_i \ge 0$ for all i, (ii) $\zeta_0 > 0$, and (iii) for each n we have $\sum_{t \in \mathcal{T}_n^*} w_\zeta(t) < \infty$.

Observe that this condition is satisfied whenever $\zeta_1 = 0$, as is the case for Schröder's
problems. For each n such that $w_\zeta(t) > 0$ for some $t \in \mathcal{T}_n^*$ we may define a probability
measure on $\mathcal{T}_n^*$ by

$$Q_n^{\zeta,*}(t) = \frac{w_\zeta(t)}{\sum_{s \in \mathcal{T}_n^*} w_\zeta(s)}.$$

We wish to consider generating functions, but we want an ordinary generating function for
$\mathcal{T}^{(o)}$ and an exponential generating function for $\mathcal{T}$. In order to do this all at once, for $z \in \mathbb{C}$,
we define $y_n(z) = z^n/n!$ and $y_n^{(o)}(z) = z^n$, both for $n \ge 0$, and we use $y_n^*$ in the same fashion
as $\mathcal{T}^*$. The weighted generating function induced on $\mathcal{T}^*$ by $\zeta$ with the weights defined above
is

$$C_\zeta^*(z) = \sum_{t \in \mathcal{T}^*} w_\zeta(t)\, y_{|t|}^*(z).$$

Letting $G_\zeta^*(z) = \sum_{i=1}^{\infty} \zeta_i\, y_i^*(z)$, it is then easy to see that $C_\zeta^*$ satisfies the functional equation

(2.1) $C_\zeta^*(z) = \zeta_0 z + G_\zeta^*(C_\zeta^*(z))$,

in the sense of formal power series. Our interest is in the measures $Q_n^{\zeta,*}$ and, in particular,
we would like to find a Galton-Watson tree T such that $Q_n^{\zeta,(o)}$ is the law of T conditioned to
have n leaves. Recall that if $(\xi_i)_{i \ge 0}$ is a distribution on $\mathbb{Z}_+$ with mean less than or equal to
one and $\xi_0 > 0$, a Galton-Watson tree with offspring distribution $\xi$ is a random element T
of $\mathcal{T}^{(o)}$ with law

$$P(T = t) = \prod_{v \in t} \xi_{\deg(v)}.$$

T is called critical if $\xi$ has mean equal to one. This leads to the notion of tilting, which
is similar to exponential tilting for Galton-Watson trees conditioned on their number of
vertices.

Proposition 1. Suppose that $\zeta$ satisfies Condition 1 and suppose that a, b > 0. Define $\tilde\zeta$ by

$\tilde\zeta_0 = a\zeta_0$ and $\tilde\zeta_i = b^{i-1}\zeta_i$ for $i \ge 1$.

Then $Q_n^{\tilde\zeta,*} = Q_n^{\zeta,*}$ for all $n \ge 1$.

Proof. This follows immediately from the computation that, for $t \in \mathcal{T}_n^*$, $w_{\tilde\zeta}(t) = a^n b^{n-1} w_\zeta(t)$.
Indeed, t has n leaves, and the sum of deg(v) − 1 over the non-leaf vertices v of t equals n − 1. □

The above result is the equivalent of exponential tilting for trees conditioned on their
number of vertices. A consequence of this is that we can find a Galton-Watson tree T such
that $Q_n^{\zeta,(o)}$ is the law of T conditioned to have n leaves if we can find a, b > 0 such that

$$a\zeta_0 + \frac{G_\zeta^{(o)}(b)}{b} = 1.$$

Furthermore, T will be critical if $(G_\zeta^{(o)})'(b) = 1$. An immediate consequence of this is the
following corollary.

Corollary 1. Let $\xi = (\xi_i)_{i \ge 0}$ be the probability distribution defined by

$\xi_0 = 2 - \sqrt{2} \approx 0.5858$, $\xi_1 = 0$, and $\xi_i = \left(\frac{2-\sqrt{2}}{2}\right)^{i-1} \approx (0.2929)^{i-1}$ for $i \ge 2$.

Note that $\xi$ has mean 1 and variance $4(\sqrt{2}-1)$. Let T be a Galton-Watson tree with offspring
distribution $\xi$. Then the law of T conditioned to have n leaves is uniform on the subset of
$\mathcal{T}_n^{(o)}$ of trees with no vertices of out degree one.

Proof. The proof follows immediately from the discussion above by noting that, if $\zeta_i = 1$ for
$i \ne 1$ and $\zeta_1 = 0$, then $Q_n^{\zeta,(o)}$ is uniform on the trees in $\mathcal{T}_n^{(o)}$ with no vertices of out degree
one. Explicitly, the distribution $\xi$ is found
by solving $(G_\zeta^{(o)})'(b) = 1$, setting $a = (b - G_\zeta^{(o)}(b))/b$, and tilting as in Proposition 1. □
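
The distribution in Corollary 1 can be sanity-checked numerically. The sketch below assumes the reading $\xi_0 = 2 - \sqrt{2}$, $\xi_1 = 0$, $\xi_i = ((2-\sqrt{2})/2)^{i-1}$ for $i \ge 2$, which matches the decimal approximations 0.5858 and 0.2929 given in the statement. The truncation level `N` is an arbitrary choice, and the variance it reports, $4(\sqrt{2}-1) \approx 1.657$, comes from summing the series rather than from the source.

```python
import math

SQRT2 = math.sqrt(2)


def xi(i):
    """Offspring probabilities of Corollary 1 (as read from the statement)."""
    if i == 0:
        return 2 - SQRT2
    if i == 1:
        return 0.0
    return ((2 - SQRT2) / 2) ** (i - 1)


N = 200  # truncation level; the neglected tail is geometrically small
total = sum(xi(i) for i in range(N))
mean = sum(i * xi(i) for i in range(N))
variance = sum(i * i * xi(i) for i in range(N)) - mean ** 2

print(total, mean, variance)
```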

Given the similarities in the construction of $Q_n^{\zeta,(o)}$ and $Q_n^{\zeta}$, there should be a natural way
to go back and forth between them.

Proposition 2. Suppose that $\zeta$ satisfies Condition 1 for * = (o). Define $\hat\zeta$ by $\hat\zeta_n = n!\,\zeta_n$.
Suppose that T is distributed like $Q_n^{\zeta,(o)}$ and let U be a uniformly random ordering of {1, ..., n}
independent of T. Define $\tilde T \in \mathcal{T}_n$ to be the tree obtained from T by labeling the leaves of T
by U and forgetting the ordering of T. Then $\tilde T$ is distributed like $Q_n^{\hat\zeta}$.

Results of this type connecting plane and labeled trees where the size of a tree is given by
the number of its vertices can be traced back to [10l [131 H3J. See [15] for a more complete
history. Our proposition is analogous to an implicit discussion in [TJ [2] as well as Theorem
7.1 in [15], which considered the case where the size of a tree is given by the number of its
vertices. To prove this proposition, we will need some notation. For a rooted ordered tree x
let shape(x) be the rooted unordered tree obtained by forgetting the order on x. Similarly,
for $t \in \mathcal{T}$, shape(t) is defined to be the rooted unlabeled tree obtained from forgetting the
labeling of t. For $t \in \mathcal{T}$, $x \in \mathcal{T}^{(o)}$, and a rooted unordered tree y define $\#\mathrm{labels}_t(x)$ to be the
number of ways to label the leaves of x such that when you forget the order on x you get t and
#ordered(y) to be the number of ordered trees whose shape is y. Observe that $\#\mathrm{labels}_t(x)$
depends only on shape(x), so we will abuse our notation and write $\#\mathrm{labels}_t(\mathrm{shape}(x))$.

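Proof. Observe that

$$P(\tilde T = t) = \sum_{x \in \mathcal{T}_n^{(o)}} P(T = x)\, P(\tilde T = t \mid T = x).$$

Furthermore, observe that

$$P(\tilde T = t \mid T = x) = \frac{\#\mathrm{labels}_t(\mathrm{shape}(x))}{n!}.$$

Observe that $\#\mathrm{labels}_t(\mathrm{shape}(x)) = 0$ unless shape(t) = shape(x). Furthermore, P(T = x)
depends only on shape(x), and is given by

$$P(T = x) = \frac{\prod_{v \in \mathrm{shape}(x)} \zeta_{\deg(v)}}{\sum_{s \in \mathcal{T}_n^{(o)}} w_\zeta(s)}.$$

Consequently we have

(2.2) $P(\tilde T = t) = \frac{\#\mathrm{ordered}(\mathrm{shape}(t))\, \#\mathrm{labels}_t(\mathrm{shape}(t)) \prod_{v \in \mathrm{shape}(t)} \zeta_{\deg(v)}}{n! \sum_{s \in \mathcal{T}_n^{(o)}} w_\zeta(s)}.$

But

(2.3) $\#\mathrm{ordered}(\mathrm{shape}(t))\, \#\mathrm{labels}_t(\mathrm{shape}(t)) = \prod_{v \in \mathrm{shape}(t)} (\deg(v)!).$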

This is because both sides count the number of distinct leaf-labeled ordered trees that equal
t upon forgetting their order. On the left hand side, you pick an ordered tree and then label
it and, on the right hand side, you label an unordered tree with the appropriate shape and
then order the children of each vertex.
Therefore we have

$$P(\tilde T = t) = \frac{w_{\hat\zeta}(t)}{n! \sum_{s \in \mathcal{T}_n^{(o)}} w_\zeta(s)}.$$

The last step is to observe that

$$n! \sum_{s \in \mathcal{T}_n^{(o)}} w_\zeta(s) = \sum_{s \in \mathcal{T}_n} w_{\hat\zeta}(s).$$

This is because for $s \in \mathcal{T}_n^{(o)}$, there are n! rooted ordered leaf-labeled trees whose ordered
tree is s upon forgetting the labeling, so the left hand side is the weighted number of rooted
ordered leaf-labeled trees with n leaves. Furthermore, we have already noted above that
for $s \in \mathcal{T}_n$, there are $\prod_{v \in s} (\deg(v)!)$ rooted ordered leaf-labeled trees whose labeled tree is
s upon forgetting the ordering. Thus the right hand side is also the weighted number of
rooted ordered leaf-labeled trees with n leaves. Note that this step also shows that $\hat\zeta$ satisfies
Condition 1 for * being nothing. □

Combining with tilting, we have the following corollary.

Corollary 2. Let $\zeta$ satisfy Condition 1 with * being nothing and $\zeta_0 = 1$. Suppose there exist
r > 0 and s > 0 satisfying $s = r + G_\zeta(s)$ and $G_\zeta'(s) \le 1$. Define $\xi = (\xi_j)_{j \ge 0}$ by $\xi_0 = r s^{-1}$
and $\xi_j = s^{j-1}\zeta_j / j!$ for $j \ge 1$. Note that $\xi$ is a probability distribution on $\mathbb{Z}_+$. Let T be a
Galton-Watson tree with offspring distribution $\xi$ and construct $\tilde T$ by labeling the leaves of T
uniformly at random with {1, ..., |T|}, independently of T. Then $P(\tilde T \in \cdot \mid |T| = n) = Q_n^{\zeta}(\cdot)$
for all $n \ge 1$ such that $Q_n^{\zeta}$ is defined. Furthermore, for n such that $Q_n^{\zeta}$ is not defined,
P(|T| = n) = 0.

2.2. Schröder's problems. In this section we record which of the trees above correspond
to the trees that appear in Schröder's problems. The proofs of the claims here are simple
applications of the results in Section 2.1.

The first problem: The trees here are uniform binary rooted ordered unlabeled trees.
We can obtain these by taking * = (o) and $\zeta_0 = \zeta_2 = 1$ and $\zeta_i = 0$ for $i \notin \{0, 2\}$. Letting $\xi$
be the probability distribution given by $\xi_0 = \xi_2 = 1/2$ and T be a Galton-Watson tree with
offspring distribution $\xi$, we have that T conditioned to have n leaves is a uniform binary
rooted ordered unlabeled tree with n leaves. Also note that T is critical and the variance of
$\xi$ is equal to one.

The second problem: These are uniform rooted ordered trees with no vertices of out
degree one. These were dealt with in Corollary 1.

The third problem: These are uniform binary unordered leaf-labeled trees. We can
obtain these by taking * to be nothing and $\zeta_0 = \zeta_2 = 1$ and $\zeta_i = 0$ for $i \notin \{0, 2\}$. In this case,
if T is the Galton-Watson tree defined in the first problem and $\tilde T$ is defined as in Corollary
2, then $\tilde T$ conditioned to have n leaves is a uniform binary unordered leaf-labeled tree with
n leaves.

The fourth problem: These are uniform rooted unordered leaf-labeled trees with no
vertices with out-degree 1. We can obtain these by taking * to be nothing and $\zeta_1 = 0$ and
$\zeta_i = 1$ for $i \ne 1$. We define a probability distribution $\xi$ by

$\xi_0 = \frac{2\log(2) - 1}{\log(2)}$, $\xi_1 = 0$, and $\xi_j = \frac{(\log(2))^{j-1}}{j!}$ for $j \ge 2$.

Note that $\xi$ has mean 1 and variance var($\xi$) = 2 log 2. Letting T be a Galton-Watson tree
with offspring distribution $\xi$ and defining $\tilde T$ as in Corollary 2, we have that $\tilde T$ conditioned
to have n leaves is a uniform unordered leaf-labeled tree with no vertices of out degree one
and n leaves.
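
The claims about this distribution (total mass 1, mean 1, variance 2 log 2) can be verified numerically. This is a minimal sketch assuming the reading $\xi_0 = (2\log 2 - 1)/\log 2$, $\xi_1 = 0$, $\xi_j = (\log 2)^{j-1}/j!$ for $j \ge 2$; the truncation level `N` is an arbitrary choice.

```python
import math

LOG2 = math.log(2)


def xi(j):
    """Offspring probabilities for Schroeder's fourth problem (as read from the text)."""
    if j == 0:
        return (2 * LOG2 - 1) / LOG2
    if j == 1:
        return 0.0
    return LOG2 ** (j - 1) / math.factorial(j)


N = 60  # truncation level; the terms decay faster than geometrically
total = sum(xi(j) for j in range(N))
mean = sum(j * xi(j) for j in range(N))
variance = sum(j * j * xi(j) for j in range(N)) - mean ** 2

print(total, mean, variance, 2 * LOG2)
```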

2.3. Gibbs trees. Above we saw a natural way to put probability measures on $\mathcal{T}_n$ that
are concentrated on fragmentation trees (the trees appearing in Schröder's fourth problem);
namely, take $\zeta_1 = 0$. Another natural type of probability to put on fragmentation trees is
a Gibbs model, which we now describe. First, we need to set up the natural framework in
which to view fragmentation trees. The idea is that, while in Schröder's fourth problem
we have an arbitrary set bracketing, for fragmentations we recursively partition a set. This
dynamic view of constructing a set bracketing makes Gibbs models quite natural.

Definition 1 ([12]). A fragmentation of the finite set B is a collection $t_B$ of non-empty
subsets of B such that

(1) $B \in t_B$,

(2) if $\#B \ge 2$ then there is a partition of B into $k \ge 2$ parts $B_1, \dots, B_k$, called the
children of B, such that

$t_B = \{B\} \cup t_{B_1} \cup \dots \cup t_{B_k}$,

where $t_{B_i}$ is a fragmentation of $B_i$.

We can naturally consider $t_B$ as a tree whose vertices are the elements of $t_B$ and whose
edges are defined by the parent child relationship. Considering the properties of such a tree
leads naturally to the following definition of a fragmentation tree on B.

Definition 2. A fragmentation tree T on n leaves is a rooted tree such that

(1) The root of T does not have degree 1,

(2) T has no non-root vertices of degree 2,

(3) The leaves of T are labeled by a set B with #B = n. We denote the label of a leaf v
by $\ell(v)$.

The idea of the Gibbs model is that, at each step in the fragmentation, the next step is
distributed according to multiplicative weights depending on the block sizes. We first take
a sequence $\{\alpha_k\}$, $\alpha_k \ge 0$, of weights and a Gibbs weight, which is a function $g : \mathbb{Z}_+ \to \mathbb{R}_+$
with g(0) = 0. Then, for $n \ge 2$, define a normalization constant

$$Z(n) = \sum_{\{B_1,\dots,B_k\}} \alpha_k \prod_{i=1}^{k} g(\#B_i),$$

where the sum is over unordered partitions of [n] into at least two blocks. Whenever we write a
formula like this, we assume that each block $B_i$ is nonempty. Now, define the probability of
a partition of [n] by

$$P_n^{g,\alpha}(B_1,\dots,B_k) = p(\#B_1,\dots,\#B_k) = \frac{\alpha_k \prod_{i=1}^{k} g(\#B_i)}{Z(n)}.$$

The probability of a fragmentation X of [n] is then defined as

$$P_n^{g,\alpha}(X) = \prod_{B \in X} p(\#B_1,\dots,\#B_k),$$

where $B_1, \dots, B_k$ are the children of B and the product is over the blocks B with $\#B \ge 2$.
Using the correspondence between fragmentations
and fragmentation trees, for $T_n \in \mathcal{T}_n$, we define $P_n^{g,\alpha}(T_n)$ to be $P_n^{g,\alpha}(X)$ where X is the
fragmentation determined by $T_n$. The probabilistic properties of Gibbs models are studied
in [12].



Theorem 2. Suppose that $\zeta$ satisfies Condition 1 with * being nothing and $\zeta_1 = 0$. Define
$\alpha_k = \zeta_k$ and $g(k) = k!\,[z^k]C_\zeta(z)$. Then Z(n) = g(n) and $Q_n^{\zeta} = P_n^{g,\alpha}$. Furthermore, given
a nonnegative weight sequence $\alpha$ and a Gibbs weight g such that Z(n) = g(n), there is a $\zeta$
satisfying Condition 1 with * being nothing and $\zeta_1 = 0$ such that $Q_n^{\zeta} = P_n^{g,\alpha}$.



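Proof. Arguing similarly as in (2.1), we see that for $n \ge 2$

$$Z(n) = \sum_{\{B_1,\dots,B_k\}} \alpha_k \prod_{j=1}^{k} g(\#B_j) = \sum_{k=2}^{\infty} \frac{\zeta_k}{k!} \sum_{\substack{(n_1,\dots,n_k) \in \mathbb{N}^k \\ n_1+\dots+n_k = n}} \binom{n}{n_1,\dots,n_k} \prod_{j=1}^{k} g(n_j)$$

$$= n! \sum_{k=2}^{\infty} \frac{\zeta_k}{k!} \sum_{\substack{(n_1,\dots,n_k) \in \mathbb{N}^k \\ n_1+\dots+n_k = n}} \prod_{j=1}^{k} [z^{n_j}]C_\zeta(z) = n!\,[z^n] \sum_{k=2}^{\infty} \frac{\zeta_k}{k!} C_\zeta(z)^k = n!\,[z^n]C_\zeta(z).$$

Consequently we have Z(n) = g(n). Using this, one proves inductively that $P_n^{g,\alpha}(T_n) =
Q_n^{\zeta}(T_n)$. Furthermore, observe that the condition Z(n) = g(n) implies that there is a weight
sequence $(\zeta_i)_{i \ge 0}$ from which the fragmentation model can be derived in the above manner;
just take $\zeta_0 = g(1)$, $\zeta_1 = 0$, and $\zeta_k = \alpha_k$ for $k \ge 2$. □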

When we have Z(n) = g(n), the model is called a combinatorial Gibbs model. This is
justified by the fact that, in this case, Z(n) (and thus g(n)) is the weighted number of trees
with n leaves. For example, if we let g(n) be the number of fragmentation trees with n
leaves, and $\alpha_k = 1$ for $k \ge 2$, we then see that

$$Z(n) = \sum_{\{B_1,\dots,B_k\}} \prod_{j=1}^{k} g(\#B_j).$$

The right hand side of this equation is just the sum over partitions at the root of a fragmentation
tree with n leaves of the number of fragmentation trees with that partition at the root,
which is precisely the number of fragmentation trees with n leaves. That is, Z(n) = g(n).
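
The identity Z(n) = g(n) doubles as a recursion for counting fragmentation trees. The sketch below takes weights equal to 1 and g(1) = 1, and evaluates the sum over partitions by conditioning on the block containing the element 1. The function name `gibbs_counts` and the auxiliary array `w` are illustrative choices, not notation from the paper.

```python
from math import comb


def gibbs_counts(n_max):
    """Return [g(0), ..., g(n_max)] with g(0) = 0, g(1) = 1 and g(n) = Z(n) for n >= 2.

    Z(n) is the sum over unordered partitions of [n] into at least two blocks
    of the product of g over the block sizes (all weights equal to 1).
    """
    g = [0, 1]
    # w[m]: the same sum over ALL partitions of [m], one-block partition included.
    w = [1, 1]
    for n in range(2, n_max + 1):
        # Condition on the block containing element 1. It has size s < n,
        # because s = n is the one-block partition, which Z(n) excludes.
        z = sum(comb(n - 1, s - 1) * g[s] * w[n - s] for s in range(1, n))
        g.append(z)          # combinatorial Gibbs model: g(n) = Z(n)
        w.append(z + g[n])   # add back the one-block partition
    return g


print(gibbs_counts(6))
```

The output starts 0, 1, 1, 4, 26, 236, so in particular it reproduces the 4 set bracketings for n = 3 mentioned in the introduction.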

Note that combinatorial Gibbs models are a generalization of the hierarchies studied in
[7] and, as previously observed, a special case of the Gibbs models introduced in [12].

3. Scaling limits 

We now turn to scaling limits of the models of trees we have been discussing. Fortunately
for us, the heavy lifting has already been done in [16]. In order to use the results from that
paper, we must first introduce the formalism required to handle limits of random metric
measure spaces.

3.1. Trees as metric measure spaces. The trees we have been talking about can naturally
be considered as metric spaces with the graph metric. That is, the distance between two
vertices is the number of edges on the path connecting them. Let (t, d) be a tree equipped
with the graph metric. For a > 0, we define at to be the metric space (t, ad), i.e. the metric
is scaled by a. This is equivalent to saying the edges have length a rather than length 1 in
the definition of the graph metric. More generally, we can attach a positive length to each
edge in t and use these in the definition of the graph metric. Moreover, the trees we are
dealing with are rooted so we consider (t, d) as a pointed metric space with the root as the
point. Moreover, we are concerned with the leaves, so we attach a measure $\mu_t$, which is the
uniform probability measure on the leaves of t. If we have a random tree T, this gives rise to
a random pointed metric measure space $(T, d, \mathrm{root}, \mu_T)$. To make this last concept rigorous,
we need to put a topology on pointed metric measure spaces. This is hard to do in general,
but note that the pointed metric measure spaces that come from the trees we are discussing
are compact.

Let $\mathcal{M}_w$ be the set of equivalence classes of compact pointed metric measure spaces
(equivalence here being up to point and measure preserving isometry). It is worth pointing
out that $\mathcal{M}_w$ actually is a set in the sense of ZFC, though this takes some work to
show. We metrize $\mathcal{M}_w$ with the pointed Gromov-Hausdorff-Prokhorov metric (see [8]). Fix
$(X, d, \rho, \mu), (X', d', \rho', \mu') \in \mathcal{M}_w$ and define

$$d_{GHP}(X, X') = \inf_{(M,\delta)} \ \inf_{\substack{\phi : X \to M \\ \phi' : X' \to M}} \left[ \delta(\phi(\rho), \phi'(\rho')) \vee \delta_H(\phi(X), \phi'(X')) \vee \delta_P(\phi_*\mu, \phi'_*\mu') \right],$$

where the first infimum is over metric spaces $(M, \delta)$, the second infimum is over isometric
embeddings $\phi$ and $\phi'$ of X and X' into M, $\delta_H$ is the Hausdorff distance on compact subsets
of M, and $\delta_P$ is the Prokhorov distance between the pushforward $\phi_*\mu$ of $\mu$ by $\phi$ and the
pushforward $\phi'_*\mu'$ of $\mu'$ by $\phi'$. Again, the definition of this metric has potential to run into
set-theoretic difficulties, but they are not terribly difficult to resolve.

Proposition 3 (Proposition 1 in [8]). The space $(\mathcal{M}_w, d_{GHP})$ is a complete separable metric
space.

An $\mathbb{R}$-tree is a complete metric space (T, d) with the following properties:

• For $v, w \in T$, there exists a unique isometry $\phi_{v,w} : [0, d(v,w)] \to T$ with $\phi_{v,w}(0) = v$ and
$\phi_{v,w}(d(v,w)) = w$.

• For every continuous injective function $c : [0,1] \to T$ such that c(0) = v and c(1) = w,
we have $c([0,1]) = \phi_{v,w}([0, d(v,w)])$.

If (T, d) is an $\mathbb{R}$-tree, every choice of root $\rho \in T$ and probability measure $\mu$ on T yields
an element $(T, d, \rho, \mu)$ of $\mathcal{M}_w$. With this choice of root also comes a height function ht(v) =
$d(v, \rho)$. The leaves of T can then be defined as the points $v \in T$ such that v is not in
$[[\rho, w[[ \,:= \phi_{\rho,w}([0, \mathrm{ht}(w)))$ for any $w \in T$. The set of leaves is denoted $\mathcal{L}(T)$.

Definition 3. A continuum tree is an $\mathbb{R}$-tree $(T, d, \rho, \mu)$ with a choice of root and probability
measure such that $\mu$ is non-atomic, $\mu(\mathcal{L}(T)) = 1$, and for every non-leaf vertex w,

$$\mu\{v \in T : [[\rho, v]] \cap [[\rho, w]] = [[\rho, w]]\} > 0.$$

The last condition says that there is a positive mass of leaves above every non-leaf vertex.
We will usually just refer to a continuum tree T, leaving the metric, root, and measure as
implicit. A continuum random tree (CRT) is an $(\mathcal{M}_w, d_{GHP})$ valued random variable that
is almost surely a continuum tree.

3.2. The Brownian continuum random tree. Continuous functions give a nice way of
constructing $\mathbb{R}$-trees. Suppose that $f : [0,1] \to \mathbb{R}_+$ is continuous and f(0) = f(1) = 0. We
can define a pseudo-metric on [0,1] by $d_f(a,b) = f(a) + f(b) - 2\min_{a \le t \le b} f(t)$ for $a \le b$.
Define an equivalence relation by $a \sim b$ if and only if $d_f(a,b) = 0$. Letting $T_f = [0,1]/\sim$, we
obtain a metric space $(T_f, d_f)$. Theorem 2.1 in [B] tells us that $(T_f, d_f)$ is a compact $\mathbb{R}$-tree.
Letting $p : [0,1] \to T_f$ be the natural map taking a point to its equivalence class, we can
take p(0) as the root of $T_f$. We then obtain an element of $\mathcal{M}_w$ by equipping $T_f$ with the
measure $\mu_f$ given by the pushforward of Lebesgue measure on [0,1] onto $T_f$ by p.
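
To make the pseudo-metric concrete, here is a minimal sketch that evaluates $d_f(a,b) = f(a) + f(b) - 2\min_{a \le t \le b} f(t)$ on a function sampled on a grid. The list `f` and the helper `tree_distance` are illustrative, and replacing the continuous function by grid values is an assumption made only for the example.

```python
def tree_distance(f, a, b):
    """d_f(a, b) = f(a) + f(b) - 2 * min of f over [a, b], for grid indices a and b."""
    lo, hi = min(a, b), max(a, b)
    return f[lo] + f[hi] - 2 * min(f[lo:hi + 1])


# A small "excursion" on a grid: it starts and ends at 0 and stays nonnegative.
f = [0, 1, 2, 1, 2, 3, 2, 1, 0]

print(tree_distance(f, 1, 7))  # 0: indices 1 and 7 are glued to the same tree point
print(tree_distance(f, 2, 4))  # 2: two branches that split at height 1
print(tree_distance(f, 0, 5))  # 3: the height of index 5 above the root
```

Index pairs at distance 0 are exactly those identified by the equivalence relation, which is how the quotient $T_f$ acquires its tree structure.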

Definition 4. Let $(e(t), 0 \le t \le 1)$ be standard Brownian excursion. The Brownian continuum
random tree, denoted $T^{Br}$, is the continuum random tree $(T_e, d_e, p(0), \mu_e)$.

Note that the elementary properties of Brownian excursion imply that $T^{Br}$ actually is
almost surely a continuum tree. It is also worth noting that in our formalism the Brownian
continuum random tree originally defined by Aldous in [3] corresponds to $(T_{2e}, d_{2e}, p(0), \mu_{2e})$,
but the convention has since shifted to the one we have adopted here (see e.g. [11]).

3.3. Convergence theorems. With this machinery, we can import the following special
case of Theorem 1 in [16].

Theorem 3 (Theorem 1 in [16]). Let T be a critical Galton-Watson tree with offspring
distribution $\xi$ such that $0 < \sigma^2 = \mathrm{var}(\xi) < \infty$. Suppose that for sufficiently large n the
probability that T has exactly n leaves is positive and for such n let $T_n$ be T conditioned
to have exactly n leaves, considered as a rooted unordered tree with edge lengths 1 and the
uniform probability distribution $\mu_{T_n}$ on its leaves. Then

$$\frac{1}{\sqrt{n}}\, T_n \xrightarrow{d} \frac{2}{\sigma\sqrt{\xi_0}}\, T^{Br},$$

where the convergence is with respect to the rooted Gromov-Hausdorff-Prokhorov topology.

Proof of Theorem 1. Corollary 2 provides for the direct conversion of the results of Theorem
3 into results about the trees in Schröder's problems. □
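
As a rough numerical companion to this proof, the sketch below computes the scaling constants for the offspring distributions of Section 2.2. It rests on an assumption about the garbled display in Theorem 3, namely that the limit has the form $\frac{1}{\sqrt{n}} T_n \to \frac{2}{\sigma\sqrt{\xi_0}} T^{Br}$; this reading reproduces the constant $2\sqrt{2}$ for the binary case. The variance $4(\sqrt{2}-1)$ used for the second problem is computed from the distribution in Corollary 1. The function name `limit_constant` is hypothetical.

```python
import math


def limit_constant(xi0, variance):
    """Constant c in (1/sqrt(n)) T_n -> c T^Br, ASSUMING c = 2 / (sigma * sqrt(xi0))."""
    return 2.0 / math.sqrt(variance * xi0)


SQRT2, LOG2 = math.sqrt(2), math.log(2)

# Problems 1 and 3: xi_0 = xi_2 = 1/2, variance 1.
c_binary = limit_constant(0.5, 1.0)
# Problem 2: xi_0 = 2 - sqrt(2), variance 4 (sqrt(2) - 1).
c_word = limit_constant(2 - SQRT2, 4 * (SQRT2 - 1))
# Problem 4: xi_0 = (2 log 2 - 1) / log 2, variance 2 log 2.
c_set = limit_constant((2 * LOG2 - 1) / LOG2, 2 * LOG2)

print(c_binary, c_word, c_set)
```

Under this assumption the three constants simplify to $2\sqrt{2}$, $1/\sqrt{3\sqrt{2}-4}$ and $2/\sqrt{4\log(2)-2}$ respectively.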

4. Explicit computations using analytic combinatorics 

The convergence result above is a powerful theorem for obtaining asymptotics of various
tree statistics, but it is difficult to prove and, as a result, asymptotics thus obtained can
seem mysterious. Consequently it is worth noting we can obtain a number of asymptotic
results directly using analytic combinatorics. This analytic approach is based on considering
the asymptotics of generating functions. The primary source for asymptotics in general is
[7], which develops the theory with extensive examples.

Our main goal in this section is to develop the general framework of additive functionals
for leaf-labeled trees whose size is counted by their number of leaves and use this to find
the asymptotic distribution of the height of a uniformly randomly chosen leaf. We also
find the limit of the expected height of a random leaf. These computations are meant to
be illustrative and by no means exhaust the power of the analytic combinatorics framework.
Indeed, it seems that most of the techniques used to study simple varieties of trees (see
[7] for a summary of the extensive work in this area) have close analogs that will provide
results about the trees we are considering here.

4.1. Analytic background. We summarize some of the fundamental results here, but make
no attempt to prove them. The approach is based on the asymptotics of several universal
functions. Recall that if f(z) is a formal power series, $[z^n]f(z)$ denotes the coefficient
of $z^n$. Similarly, if $f : \mathbb{C} \to \mathbb{C}$ is analytic at 0 then $[z^n]f(z)$ denotes the coefficient of $z^n$ in the
power series expansion of f at 0.

Proposition 4. Let $f(z) = (1-z)^{1/2}$, $g(z) = (1-z)^{-1/2}$, and $h(z) = (1-z)^{-1}$. Then
$[z^n]f(z) \sim -1/(2\sqrt{\pi n^3})$, $[z^n]g(z) \sim 1/\sqrt{\pi n}$, and $[z^n]h(z) = 1$.
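
These classical asymptotics are easy to check numerically. The sketch below uses the exact binomial expressions $[z^n](1-z)^{-1/2} = \binom{2n}{n}/4^n$ and $[z^n](1-z)^{1/2} = -\binom{2n}{n}/(4^n(2n-1))$ and compares them with $1/\sqrt{\pi n}$ and $-1/(2\sqrt{\pi n^3})$. The function names and the choice n = 1000 are illustrative.

```python
from math import comb, pi, sqrt


def coeff_g(n):
    """[z^n] (1 - z)^(-1/2) = binom(2n, n) / 4^n."""
    return comb(2 * n, n) / 4 ** n


def coeff_f(n):
    """[z^n] (1 - z)^(1/2) = -binom(2n, n) / (4^n (2n - 1)), for n >= 1."""
    return -comb(2 * n, n) / (4 ** n * (2 * n - 1))


n = 1000
# Both ratios should be close to 1 if the asymptotic forms are right.
ratio_g = coeff_g(n) * sqrt(pi * n)
ratio_f = coeff_f(n) * (-2 * sqrt(pi * n ** 3))

print(ratio_g, ratio_f)
```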



To use these classical results we need a special type of analyticity called $\Delta$-analyticity,
which we now define.

Definition 5 (Definition VI.1 p. 389 [7]). Given two numbers $\phi$ and R with R > 1 and
$0 < \phi < \pi/2$, the open domain $\Delta(\phi, R)$ is defined as

$$\Delta(\phi, R) = \{z \mid |z| < R,\ z \ne 1,\ |\arg(z-1)| > \phi\}.$$

For a complex number $\zeta$, a domain D is a $\Delta$-domain at $\zeta$ if there exist $\phi$ and R such that
$D = \zeta\Delta(\phi, R)$. A function is $\Delta$-analytic if it is analytic on a $\Delta$-domain.

Let

$$S = \{(1-z)^{-\alpha}\lambda(z)^{\beta} \mid \alpha, \beta \in \mathbb{C}\}, \qquad \lambda(z) = \frac{1}{z}\log\frac{1}{1-z}.$$

Theorem 4 (Theorem VI.4 p. 393 [7]). Let f(z) be a function analytic at 0 with a singularity
at $\zeta$, such that f(z) can be continued to a domain of the form $\zeta\Delta_0$, for a $\Delta$-domain $\Delta_0$.
Assume that there exist two functions $\sigma$ and $\tau$, where $\sigma$ is a (finite) linear combination of
elements of S and $\tau \in S$, so that

$$f(z) = \sigma(z/\zeta) + O(\tau(z/\zeta)) \quad \text{as } z \to \zeta \text{ in } \zeta\Delta_0.$$

Then the coefficients of f(z) satisfy the asymptotic estimate

$$[z^n]f(z) = \zeta^{-n}[z^n]\sigma(z) + O(\zeta^{-n}\tau_n^*),$$

where $\tau_n^* = n^{a-1}(\log n)^b$, if $\tau(z) = (1-z)^{-a}\lambda(z)^b$.

Occasionally we will also need to deal with derivatives and the next theorem shows us how
this is done.

Theorem 5 (Theorem VI.8 p. 419 [7]). Let f(z) be $\Delta$-analytic with singular expansion near
its singularity of the simple form

$$f(z) = \sum_{j=0}^{J} c_j (1-z)^{\alpha_j} + O((1-z)^{A}).$$

Then, for each integer r > 0, the derivative $f^{(r)}(z)$ is $\Delta$-analytic. The expansion of the
derivative at its singularity is obtained through term by term differentiation:

$$\frac{d^r}{dz^r} f(z) = (-1)^r \sum_{j=0}^{J} c_j \frac{\Gamma(\alpha_j + 1)}{\Gamma(\alpha_j + 1 - r)} (1-z)^{\alpha_j - r} + O((1-z)^{A-r}).$$

The generating functions we will work with fall into the smooth implicit-function schema,
which provides a way to derive coefficient asymptotics from functional equations.

Definition 6 (Definition VII.4 p. 467 [7]). Let y(z) be a function analytic at 0, $y(z) =
\sum_{n \ge 0} y_n z^n$, with $y_0 = 0$ and $y_n \ge 0$. The function is said to belong to the smooth implicit-function
schema if there exists a bivariate function G(z, w) such that

$$y(z) = G(z, y(z)),$$

where G(z, w) satisfies the following conditions.

(i) $G(z,w) = \sum_{m,n \ge 0} g_{m,n} z^m w^n$ is analytic in a domain |z| < R and |w| < S, for some
R, S > 0.

(ii) The coefficients of G satisfy $g_{m,n} \ge 0$, $g_{0,0} = 0$, $g_{0,1} \ne 1$, and $g_{m,n} > 0$ for some m
and for some $n \ge 2$.

(iii) There exist two numbers r and s such that 0 < r < R and 0 < s < S, satisfying the
system of equations

$$G(r, s) = s, \quad G_w(r, s) = 1, \quad \text{with } r < R,\ s < S,$$

which is called the characteristic system.

Definition 7 (Definition IV.5 p. 266 [7]). Consider the formal power series $f(z) = \sum f_n z^n$.
The series f is said to admit span d if for some r

$$\{n : f_n \ne 0\} \subseteq r + d\,\mathbb{Z}_+.$$

The largest span is the period of f. If f has period 1, then f is aperiodic.

With this definition, we get the following theorem. It is worth noting that this result 
appears in several places in the literature. We give the version that appears as Theorem 
VII. 3 on page 468 of [7]. In that source it is footnoted that many statements occurring 
previously in the literature contained errors, so caution is advised. 

Theorem 6 (Theorem VII.3 p. 468 [7]). Let $y(z)$ belong to the smooth implicit-function schema defined by $G(z,w)$, with $(r,s)$ the positive solution of the characteristic system. Then $y(z)$ converges at $z = r$, where it has a square root singularity,

\[ y(z) = s - \gamma\sqrt{1 - z/r} + O(1 - z/r), \qquad \gamma := \sqrt{\frac{2rG_z(r,s)}{G_{ww}(r,s)}}, \]

the expansion being valid in a $\Delta$-domain. If, in addition, $y(z)$ is aperiodic, then $r$ is the unique dominant singularity of $y$ and the coefficients satisfy

\[ [z^n]y(z) = \frac{\gamma}{2\sqrt{\pi n^3}}\, r^{-n} \left(1 + O(n^{-1})\right). \]
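As a numerical sanity check of Theorem 6, the following Python sketch takes the simplest member of the schema, $G(z,w) = z + w^2/2$. This is an illustrative choice, for which the characteristic system gives $r = 1/2$, $s = 1$ and $\gamma = \sqrt{2rG_z/G_{ww}} = 1$. The sketch extracts the coefficients of $y(z) = z + y(z)^2/2$ exactly and compares them with the leading term $\gamma r^{-n}/(2\sqrt{\pi n^3})$. The function names are hypothetical and the code is only a sketch under these assumptions.

```python
import math
from fractions import Fraction


def implicit_coeffs(n_max):
    """Return [y_0, ..., y_{n_max}] for y(z) = z + y(z)^2 / 2 as exact fractions."""
    y = [Fraction(0)] * (n_max + 1)
    y[1] = Fraction(1)
    for n in range(2, n_max + 1):
        # For n >= 2 only the quadratic term of G contributes to [z^n].
        y[n] = sum(y[i] * y[n - i] for i in range(1, n)) / 2
    return y


def theorem6_estimate(n, r=0.5, gamma=1.0):
    """Leading term gamma / (2 sqrt(pi n^3)) * r^(-n) from Theorem 6."""
    return gamma / (2.0 * math.sqrt(math.pi * n ** 3)) * r ** (-n)


coeffs = implicit_coeffs(200)
for n in (10, 50, 200):
    # The ratio should approach 1 at rate O(1/n).
    print(n, float(coeffs[n]) / theorem6_estimate(n))
```

The exact recursion is quadratic in `n_max`, which is adequate for a check of this size; the ratio printed in the last column should drift toward 1 as $n$ grows.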



We will also need the following theorem. 

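Theorem 7 ((A special case of) Theorem IX.16 p. 709 [7]). Let $H(z)$ be $\Delta$-continuable and of the form $H(z) = \sigma - h(1 - z/\rho)^{1/2} + O(1 - z/\rho)$ and let $k_n = x_n n^{1/2}$ for $x_n$ in any compact subinterval of $(0, \infty)$. Then

\[ [z^n]H(z)^{k_n} \sim \sigma^{k_n}\rho^{-n}\, \frac{h k_n}{2\sigma\sqrt{\pi n^3}} \exp\left(-\frac{h^2 k_n^2}{4\sigma^2 n}\right). \]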
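The semi-large powers estimate can be checked numerically on a concrete example. The sketch below assumes $H(z) = 1 - \sqrt{1 - 2z}$, so that $\sigma = 1$, $h = 1$ and $\rho = 1/2$, and it relies on the standard ballot-number identity $[z^n]H(z)^k = 2^{k-n}\frac{k}{2n-k}\binom{2n-k}{n-k}$, which is brought in here as an outside fact and is not taken from the text. Function names are hypothetical. Logarithms are used throughout to avoid overflow.

```python
import math


def log_exact_power_coeff(n, k):
    """log of [z^n] H(z)^k for H(z) = 1 - sqrt(1 - 2z), via the ballot-number identity."""
    log_binom = (math.lgamma(2 * n - k + 1)
                 - math.lgamma(n - k + 1)
                 - math.lgamma(n + 1))
    return (k - n) * math.log(2) + math.log(k) - math.log(2 * n - k) + log_binom


def log_semi_large_estimate(n, k, sigma=1.0, h=1.0, rho=0.5):
    """log of sigma^k rho^(-n) h k / (2 sigma sqrt(pi n^3)) exp(-h^2 k^2 / (4 sigma^2 n))."""
    return (k * math.log(sigma) - n * math.log(rho)
            + math.log(h * k) - math.log(2 * sigma)
            - 0.5 * math.log(math.pi * n ** 3)
            - (h * h * k * k) / (4 * sigma * sigma * n))


def ratio(n, k):
    """Exact coefficient divided by the estimate; should be close to 1 when k ~ sqrt(n)."""
    return math.exp(log_exact_power_coeff(n, k) - log_semi_large_estimate(n, k))


for n in (100, 10_000, 1_000_000):
    k = int(math.isqrt(n))  # corresponds to x_n = 1
    print(n, k, ratio(n, k))
```

Taking $k_n = \lfloor\sqrt{n}\rfloor$ keeps $x_n$ near 1, inside a compact subinterval of $(0,\infty)$ as the theorem requires.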
4.2. Restricting the generality. So far we have been considering a very general situation. However, the computations that follow are tedious to carry out in full generality, so we now restrict it. In particular, we will let $\zeta = (\zeta_i)_{i \ge 0}$ be a sequence of non-negative weights such that $\zeta_0 = 1$, $\zeta_1 = 0$, $\gcd\{k : \zeta_k \ne 0\} = 1$, and

\[ G_\zeta(w) = \sum_{j=2}^{\infty} \zeta_j \frac{w^j}{j!} \]

is entire. These conditions can be relaxed, but doing so makes the analysis more difficult.

Proposition 5. With $\zeta$ as above, $C_\zeta$ (defined in an earlier section) belongs to the smooth implicit-function schema with $G(z,w) = z + G_\zeta(w)$. Furthermore, in the case where $\zeta$ corresponds to Schröder's third problem, $(r,s) = (1/2, 1)$, and in the case of the fourth problem, $(r,s) = (2\log(2) - 1, \log(2))$. Additionally,

\[ [z^n]C_\zeta(z) \sim \frac{\gamma}{2\sqrt{\pi n^3}}\, r^{-n}, \qquad \gamma = \sqrt{\frac{2r}{G''_\zeta(s)}}. \]

Proof. All that really needs to be checked is that the characteristic system has a positive solution. For $G(z,w) = z + G_\zeta(w)$, the characteristic system is $s = r + G_\zeta(s)$ and $G'_\zeta(s) = 1$. Using that $G_\zeta$ is entire, $G'_\zeta(0) = 0$ and $G'_\zeta(+\infty) = +\infty$, and $G'_\zeta$ is increasing on $\mathbb{R}_+$, the intermediate value theorem yields $s > 0$. An easy computation yields that $G_\zeta(s) < sG'_\zeta(s) = s$, so $r = s - G_\zeta(s) > 0$ as well. □
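The values of $(r,s)$ in Proposition 5 can be reproduced numerically by following the proof: solve $G'_\zeta(s) = 1$ by bisection, then set $r = s - G_\zeta(s)$. The sketch below assumes that the third problem corresponds to $G_\zeta(w) = w^2/2$ (only $\zeta_2 = 1$) and the fourth to $G_\zeta(w) = e^w - 1 - w$ ($\zeta_j = 1$ for all $j \ge 2$). These weight choices are assumptions made for illustration, chosen because they reproduce the stated values; the function name is hypothetical.

```python
import math


def solve_characteristic_system(G, dG, hi=64.0, iterations=200):
    """Solve s = r + G(s), G'(s) = 1 for the kernel G(z, w) = z + G(w).

    Assumes dG is increasing on [0, hi] with dG(0) < 1 < dG(hi).
    """
    lo = 0.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if dG(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    s = 0.5 * (lo + hi)
    r = s - G(s)  # first equation of the characteristic system
    return r, s


# Assumed weights for the third problem: zeta_2 = 1, all others 0.
third = solve_characteristic_system(lambda w: w * w / 2.0, lambda w: w)

# Assumed weights for the fourth problem: zeta_j = 1 for every j >= 2.
fourth = solve_characteristic_system(lambda w: math.expm1(w) - w, math.expm1)

print("third problem  (r, s) =", third)
print("fourth problem (r, s) =", fourth)
```

Bisection is used because the proof only guarantees monotonicity of $G'_\zeta$; a Newton iteration would also work for these two smooth examples.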

4.3. The height of a random leaf. Let $H_n$ be the height of a randomly chosen leaf from a tree in $\mathcal{T}_n$. Specifically, to get $H_n$, we choose a tree $T_n$ from $\mathcal{T}_n$ according to $P_n$ and then choose a leaf uniformly at random from $T_n$. Our main result in this section is the following theorem.

Theorem 8.

\[ \frac{\lambda}{\sqrt{n}}\, H_n \xrightarrow{d} \mathrm{Rayleigh}(1), \]

for $\lambda = \sqrt{G''_\zeta(s)\, r}$. In Schröder's third problem $\lambda = 1/\sqrt{2}$, and in the fourth problem $\lambda = \sqrt{4\log(2) - 2}$.

Our approach will be that of additive functionals, whose theory we now develop. We 
parallel the development of these functions in [7], p. 457. Their work was done for simple 
varieties of trees whose size was determined by the number of vertices. Here we work with 
trees whose size is determined by the number of leaves. 

For a rooted unordered tree $t$ whose leaves are labeled by $B \subset \mathbb{N}$, let $\tilde{t} \in \mathcal{T}$ be the tree that results from relabeling the leaves of $t$ by the unique increasing bijection from $B$ to $\{1, 2, \ldots, |B|\}$ (where $|B|$ is the cardinality of $B$). Suppose we have functions $\xi, \theta, \psi : \mathcal{T} \to \mathbb{R}$ satisfying the relation

\[ \xi(t) = \theta(t) + \sum_{i=1}^{\deg(t)} \psi(\tilde{t}_i), \]

where $\deg(t)$ is the root degree of $t$ and the $\{t_i\}$ are the root subtrees of $t$ ordered in increasing order of the leaf with the smallest label. Letting $\bullet$ denote the tree on one leaf, we note that $\deg(\bullet) = 0$, so in particular $\xi(\bullet) = \theta(\bullet)$. Define the exponential generating functions

\[ \Xi(z) = \sum_{t \in \mathcal{T}} \xi(t) w(t) \frac{z^{|t|}}{|t|!}, \qquad \Theta(z) = \sum_{t \in \mathcal{T}} \theta(t) w(t) \frac{z^{|t|}}{|t|!}, \qquad \text{and} \qquad \Psi(z) = \sum_{t \in \mathcal{T}} \psi(t) w(t) \frac{z^{|t|}}{|t|!}. \]

Our results make use of the following lemma, which is a relation of formal power series. 

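Lemma 1. Let $C_\zeta(z)$ be the exponential generating function associated with $\zeta$. Then

(4.1) \[ \Xi(z) = \Theta(z) + G'_\zeta(C_\zeta(z))\,\Psi(z). \]

In the purely recursive case where $\xi = \psi$ we have

(4.2) \[ \Xi(z) = \frac{\Theta(z)}{1 - G'_\zeta(C_\zeta(z))} = C'_\zeta(z)\,\Theta(z). \]

Proof. We clearly have

\[ \Xi(z) = \Theta(z) + \tilde{\Psi}(z), \qquad \text{where} \qquad \tilde{\Psi}(z) = \sum_{t \in \mathcal{T}} w(t) \frac{z^{|t|}}{|t|!} \sum_{i=1}^{\deg(t)} \psi(\tilde{t}_i). \]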



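Decomposing by root degree and using that $\xi(\bullet) = \theta(\bullet)$, we have

\[
\begin{aligned}
\tilde{\Psi}(z) &= \sum_{r \ge 1} \sum_{\deg(t) = r} \zeta_r \prod_{i=1}^{r} w(\tilde{t}_i)\, \frac{z^{|t_1| + \cdots + |t_r|}}{(|t_1| + \cdots + |t_r|)!}\, \big(\psi(\tilde{t}_1) + \cdots + \psi(\tilde{t}_r)\big) \\
&= \sum_{r \ge 1} \frac{\zeta_r}{r!} \sum_{(t_1, \ldots, t_r) \in \mathcal{T}^r} \prod_{i=1}^{r} w(t_i) \binom{|t_1| + \cdots + |t_r|}{|t_1|, \ldots, |t_r|} \frac{z^{|t_1| + \cdots + |t_r|}}{(|t_1| + \cdots + |t_r|)!}\, \big(\psi(t_1) + \cdots + \psi(t_r)\big) \\
&= \sum_{r \ge 1} \frac{\zeta_r}{r!} \sum_{(t_1, \ldots, t_r) \in \mathcal{T}^r} \prod_{i=1}^{r} w(t_i)\, \frac{z^{|t_1| + \cdots + |t_r|}}{|t_1|! \cdots |t_r|!}\, \big(\psi(t_1) + \cdots + \psi(t_r)\big) \\
&= \sum_{r \ge 1} \frac{\zeta_r}{(r-1)!}\, C_\zeta(z)^{r-1}\, \Psi(z) \\
&= G'_\zeta(C_\zeta(z))\,\Psi(z).
\end{aligned}
\]

This yields (4.1). In the recursive case, we have $\Xi(z) = \Theta(z) + G'_\zeta(C_\zeta(z))\,\Xi(z)$. Solving for $\Xi(z)$ gives the first equality in (4.2). To get the second, we differentiate the functional equation $C_\zeta(z) = z + G_\zeta(C_\zeta(z))$ (equation (2.1)) to get $C'_\zeta(z) = 1 + G'_\zeta(C_\zeta(z))\,C'_\zeta(z)$. Solving for $C'_\zeta(z)$ gives $C'_\zeta(z) = 1/(1 - G'_\zeta(C_\zeta(z)))$, from which the second equality in (4.2) is immediate. □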

Two immediate applications are to counting the weighted numbers of leaves and vertices 
of a given height. 

Theorem 9. The expected number of leaves at height $k$ converges to $G''_\zeta(s)\, r k$ and the expected number of nodes at height $k$ converges to $sG''_\zeta(s) k + 1$.
Proof. Let $\xi_k(t)$ be the number of leaves of height $k$ in $t$, so that $\xi_k(t) w(t)$ is the weighted number of leaves of height $k$. Define $\Xi_k(z) = \sum_t \xi_k(t) w(t) z^{|t|}/|t|!$. For $k \ge 1$ we apply the lemma with $\xi = \xi_k$, $\theta = 0$ and $\psi = \xi_{k-1}$ to obtain

\[ \Xi_k(z) = G'_\zeta(C_\zeta(z))\, \Xi_{k-1}(z), \]

which easily yields

\[ \Xi_k(z) = [G'_\zeta(C_\zeta(z))]^k\, \Xi_0(z) = z\, [G'_\zeta(C_\zeta(z))]^k. \]

Letting $\Lambda_k(z)$ be the generating function for the weighted number of vertices of height $k$ we similarly get

\[ \Lambda_k(z) = [G'_\zeta(C_\zeta(z))]^k\, \Lambda_0(z) = C_\zeta(z)\, [G'_\zeta(C_\zeta(z))]^k. \]

Using these forms, we are able to compute asymptotics. Expanding $G'_\zeta$ about $s$, we have that $G'_\zeta(z) = 1 + G''_\zeta(s)(z - s) + O((z - s)^2)$. Plugging in the asymptotic expansion of $C_\zeta$ we get from Proposition 5 and Theorem 6 and doing some algebra, we have

(4.3) \[ G'_\zeta(C_\zeta(z)) = 1 - G''_\zeta(s)\gamma\sqrt{1 - z/r} + O(1 - z/r). \]

Hence, using that $(1 - z)^k = 1 - kz + O(z^2)$, we see that

\[ [G'_\zeta(C_\zeta(z))]^k = 1 - G''_\zeta(s) k \gamma \sqrt{1 - z/r} + O(1 - z/r). \]

Thus, using Theorem 4 we have

\[ [z^n]\Xi_k(z) = [z^n]\, z\, [G'_\zeta(C_\zeta(z))]^k = [z^{n-1}]\, [G'_\zeta(C_\zeta(z))]^k \sim \frac{G''_\zeta(s) k \gamma}{2 r^{n-1} \sqrt{\pi n^3}} \]

and, similarly with a bit more algebra,

\[ [z^n]\Lambda_k(z) = [z^n]\, C_\zeta(z)\, [G'_\zeta(C_\zeta(z))]^k \sim \frac{\gamma\,(s k G''_\zeta(s) + 1)}{2 r^n \sqrt{\pi n^3}}. \]



Using the result on p. 474 of [7], that $[z^n]C_\zeta(z) \sim \gamma / (2 r^n \sqrt{\pi n^3})$, we find that

\[ \mathbb{E}_{\mathcal{T}_n}(\xi_k) = \frac{n!\,[z^n]\Xi_k(z)}{n!\,[z^n]C_\zeta(z)} \sim G''_\zeta(s)\, r k. \]

Letting $\zeta_k : \mathcal{T} \to \mathbb{Z}$ be the number of nodes of height $k$ in $t$, we have that

\[ \mathbb{E}_{\mathcal{T}_n}(\zeta_k) = \frac{n!\,[z^n]\Lambda_k(z)}{n!\,[z^n]C_\zeta(z)} \sim s G''_\zeta(s) k + 1. \]

The proof of Theorem 8 is similar, but we make use of Theorem 7 for the asymptotics.

□
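The limit in Theorem 9 can be observed on exact small cases. The following Python sketch assumes the weights of the third problem are given by $G_\zeta(w) = w^2/2$, so that $G'_\zeta(w) = w$, $C_\zeta = z + C_\zeta^2/2$, $r = 1/2$ and $G''_\zeta(s) = 1$. It computes $\mathbb{E}_{\mathcal{T}_n}(\xi_k) = [z^n]\, z\, C_\zeta(z)^k / [z^n] C_\zeta(z)$ with exact rational power series and compares it with the predicted limit $G''_\zeta(s)\, r k = k/2$. All function names are hypothetical.

```python
from fractions import Fraction


def tree_coeffs(n_max):
    """Coefficients [z^n] C(z) for C = z + C^2/2 (assumed third-problem weights)."""
    c = [Fraction(0)] * (n_max + 1)
    c[1] = Fraction(1)
    for n in range(2, n_max + 1):
        c[n] = sum(c[i] * c[n - i] for i in range(1, n)) / 2
    return c


def series_product(a, b):
    """Product of two power series of equal length, truncated at that length."""
    n_max = len(a) - 1
    out = [Fraction(0)] * (n_max + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j in range(n_max - i + 1):
            out[i + j] += ai * b[j]
    return out


def expected_leaves_at_height(k, n, c):
    """E[xi_k] = [z^n] z C(z)^k / [z^n] C(z), using G'(w) = w."""
    power = [Fraction(1)] + [Fraction(0)] * (len(c) - 1)  # C^0 = 1
    for _ in range(k):
        power = series_product(power, c)
    return power[n - 1] / c[n]  # the factor z shifts the index by one


c = tree_coeffs(120)
for k in (1, 2, 3):
    print(k, float(expected_leaves_at_height(k, 120, c)), "limit:", k / 2)
```

For $n = 3$ the three binary trees each have one leaf at height 1 and two at height 2, which gives a quick hand check of the function before trusting it at larger $n$.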



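Proof of Theorem 8. Let $\{k_n\}$ be a sequence of integers such that $c k_n / n^{1/2} \to x \in (0, \infty)$ for some $c > 0$. By Theorem 7 and equation (4.3) we see that

(4.4) \[ [z^n]\, [G'_\zeta(C_\zeta(z))]^{k_n} \sim \frac{G''_\zeta(s)\gamma k_n}{2 r^n \sqrt{\pi n^3}} \exp\left(-\frac{G''_\zeta(s)^2 \gamma^2 k_n^2}{4n}\right). \]

Since $[z^n]\Xi_k(z) = [z^{n-1}]\,[G'_\zeta(C_\zeta(z))]^k$ and $\gamma^2 = 2r/G''_\zeta(s)$, this yields

\[ \mathbb{E}_{\mathcal{T}_n}(\xi_{k_n}) \sim G''_\zeta(s)\, r k_n \exp\left(-\frac{G''_\zeta(s)\, r k_n^2}{2(n-1)}\right). \]

Note that $P(H_n = k) = \mathbb{E}_{\mathcal{T}_n}(\xi_k)/n$. Observe that $\{k_n\}$ satisfies the hypotheses of the above theorem. Consequently, we have that

\[
\begin{aligned}
\frac{\sqrt{n}}{c}\, P\left(\frac{c}{\sqrt{n}} H_n = \frac{c k_n}{\sqrt{n}}\right)
&\sim \frac{G''_\zeta(s)\, r}{c^2}\, \frac{c k_n}{\sqrt{n}} \exp\left(-\frac{G''_\zeta(s)\, r}{2c^2}\, \frac{c^2 k_n^2}{n-1}\right) \\
&\to \frac{G''_\zeta(s)\, r}{c^2}\, x \exp\left(-\frac{G''_\zeta(s)\, r}{2c^2}\, x^2\right).
\end{aligned}
\]

The proof is finished by an application of a standard corollary of Scheffé's theorem (see Theorem 3.3 in [4] for an idea of the proof, just adapted for a distribution on $(0, \infty)$) and choosing $c = \sqrt{G''_\zeta(s)\, r}$. □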

In addition to proving convergence in distribution we can prove convergence of the first 
moment. 



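Theorem 10. $\mathbb{E}_{\mathcal{T}_n} H_n \sim \sqrt{\dfrac{\pi}{2 r G''_\zeta(s)}}\; n^{1/2}$.

The approach is to first compute the expected sum of the heights of the leaves of a tree.

Theorem 11. Let $\phi(t)$ be the sum of the heights of the leaves of $t$. Then $\mathbb{E}_{\mathcal{T}_n}(\phi) \sim \sqrt{\dfrac{\pi}{2 r G''_\zeta(s)}}\; n^{3/2}$.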



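Proof. Observe that

\[ \phi(t) = |t| + \sum_{i=1}^{\deg(t)} \phi(\tilde{t}_i). \]

Let $\Phi(z)$ be the exponential generating function associated with $\phi$. Applying Lemma 1 we have

\[ \Phi(z) = z\, (C'_\zeta(z))^2. \]

By Theorem 5 we have

\[ C'_\zeta(z) = \frac{\gamma}{2r}\, (1 - z/r)^{-1/2} + O(1). \]

Consequently,

\[ (C'_\zeta(z))^2 = \frac{\gamma^2}{4r^2}\, (1 - z/r)^{-1} + O\big((1 - z/r)^{-1/2}\big). \]

Therefore

\[ \mathbb{E}_{\mathcal{T}_n}(\phi) = \frac{n!\,[z^n]\, z\, (C'_\zeta(z))^2}{n!\,[z^n]\, C_\zeta(z)} \sim \frac{\dfrac{\gamma^2}{4r^2}\, r^{-(n-1)}}{\dfrac{\gamma}{2\sqrt{\pi n^3}}\, r^{-n}} = \frac{\gamma\sqrt{\pi}}{2r}\, n^{3/2} = \sqrt{\frac{\pi}{2 r G''_\zeta(s)}}\; n^{3/2}. \]

□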



Proof of Theorem 10. Simply observe that $\mathbb{E}_{\mathcal{T}_n} H_n = \frac{1}{n}\, \mathbb{E}_{\mathcal{T}_n}(\phi)$. □
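For the third problem the first-moment asymptotics can be compared with an exact computation. The sketch below assumes $G_\zeta(w) = w^2/2$, so that $r = 1/2$, $G''_\zeta(s) = 1$ and the predicted growth is $\mathbb{E}_{\mathcal{T}_n} H_n \sim \sqrt{\pi n}$. It also relies on the standard ballot-number identity $[z^m]C_\zeta(z)^k = 2^{k-m}\frac{k}{2m-k}\binom{2m-k}{m-k}$ for $C_\zeta(z) = 1 - \sqrt{1-2z}$, which is an outside fact and not taken from the text. Function names are hypothetical.

```python
import math
from fractions import Fraction


def ballot(m, k):
    """Integer k/(2m-k) * binom(2m-k, m-k), so [z^m] C(z)^k = 2^(k-m) * ballot(m, k)."""
    return k * math.comb(2 * m - k, m - k) // (2 * m - k)


def mean_leaf_height(n):
    """Exact E[H_n] = (1/n) * sum_k k * E[xi_k] under the assumed third-problem weights.

    Uses E[xi_k] = [z^(n-1)] C^k / [z^n] C = 2^k * ballot(n-1, k) / ballot(n, 1).
    """
    total = sum(k * 2 ** k * ballot(n - 1, k) for k in range(1, n))
    return Fraction(total, n * ballot(n, 1))


for n in (10, 100, 2000):
    exact = float(mean_leaf_height(n))
    print(n, exact, exact / math.sqrt(math.pi * n))
```

The ratio in the last column approaches 1 slowly, at rate $O(n^{-1/2})$, so the agreement at moderate $n$ is only approximate.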

References 

[1] David Aldous. The continuum random tree. I. Ann. Probab., 19(1):1-28, 1991.
[2] David Aldous. The continuum random tree. II. An overview. In Stochastic analysis (Durham, 1990), volume 167 of London Math. Soc. Lecture Note Ser., pages 23-70. Cambridge Univ. Press, Cambridge, 1991.
[3] David Aldous. The continuum random tree. III. Ann. Probab., 21(1):248-289, 1993.
[4] Patrick Billingsley. Convergence of probability measures. Wiley Series in Probability and Statistics: Probability and Statistics. John Wiley & Sons Inc., New York, second edition, 1999. A Wiley-Interscience Publication.
[5] Michael Drmota. Random trees. Springer WienNew York, Vienna, 2009. An interplay between combinatorics and probability.
[6] Thomas Duquesne and Jean-François Le Gall. Probabilistic and fractal aspects of Lévy trees. Probab. Theory Related Fields, 131(4):553-603, 2005.
[7] Philippe Flajolet and Robert Sedgewick. Analytic combinatorics. Cambridge University Press, Cambridge, 2009.
[8] Bénédicte Haas and Grégory Miermont. Scaling limits of Markov branching trees, with applications to Galton-Watson and random unordered trees. arXiv:1003.3632v2, 2010.
[9] Bénédicte Haas, Grégory Miermont, Jim Pitman, and Matthias Winkel. Continuum tree asymptotics of discrete fragmentations and applications to phylogenetic models. Ann. Probab., 36(5):1790-1837, 2008.
[10] V. F. Kolchin. Branching processes, random trees, and a generalized scheme of arrangements of particles. Mathematical Notes, 21:386-394, 1977. 10.1007/BF01788236.
[11] Jean-François Le Gall. Random real trees. Ann. Fac. Sci. Toulouse Math. (6), 15(1):35-62, 2006.
[12] Peter McCullagh, Jim Pitman, and Matthias Winkel. Gibbs fragmentation trees. Bernoulli, 14:988-1002, 2008.
[13] Yu. L. Pavlov. The asymptotic distribution of maximum tree size in a random forest. Theor. Probab. Appl., 22(3):509-520, 1978.
[14] Yu. L. Pavlov. Limit distributions of the height of a random forest. Teor. Veroyatnost. i Primenen., 28(3):449-457, 1983.
[15] Jim Pitman. Enumerations of trees and forests related to branching processes and random walks. In Microsurveys in discrete probability (Princeton, NJ, 1997), volume 41 of DIMACS Ser. Discrete Math. Theoret. Comput. Sci., pages 163-180. Amer. Math. Soc., Providence, RI, 1998.
[16] Douglas Rizzolo. Scaling limits of Markov branching trees and Galton-Watson trees conditioned on the number of vertices with out-degree in a given set. arXiv:1105.2528v1, 2011.
[17] E. Schröder. Vier combinatorische Probleme. Z. Math. Physik, 15:361-376, 1870.
[18] Richard P. Stanley. Enumerative combinatorics. Vol. 2, volume 62 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 1999.




Department of Statistics, University of California, Berkeley, CA 94720 
E-mail address: pitman@stat.berkeley.edu 

Department of Mathematics, University of California, Berkeley, CA 94720 
E-mail address: drizzolo@math.berkeley.edu 



